Single Image 3D Interpreter Network
نویسندگان
چکیده
Understanding 3D object structure from a single image is an important but difficult task in computer vision, mostly due to the lack of 3D object annotations in real images. Previous work tackles this problem by either solving an optimization task given 2D keypoint positions, or training on synthetic data with ground truth 3D information. In this work, we propose 3D INterpreter Network (3D-INN), an endto-end framework which sequentially estimates 2D keypoint heatmaps and 3D object structure, trained on both real 2D-annotated images and synthetic 3D data. This is made possible mainly by two technical innovations. First, we propose a Projection Layer, which projects estimated 3D structure to 2D space, so that 3D-INN can be trained to predict 3D structural parameters supervised by 2D annotations on real images. Second, heatmaps of keypoints serve as an intermediate representation connecting real and synthetic data, enabling 3D-INN to benefit from the variation and abundance of synthetic 3D objects, without suffering from the difference between the statistics of real and synthesized images due to imperfect rendering. The network achieves state-of-the-art performance on both 2D keypoint estimation and 3D structure recovery. We also show that the recovered 3D information can be used in other vision applications, such as image retrieval.
منابع مشابه
A Deep Model for Super-resolution Enhancement from a Single Image
This study presents a method to reconstruct a high-resolution image using a deep convolution neural network. We propose a deep model, entitled Deep Block Super Resolution (DBSR), by fusing the output features of a deep convolutional network and a shallow convolutional network. In this way, our model benefits from high frequency and low frequency features extracted from deep and shallow networks...
متن کاملA hierarchical Convolutional Neural Network for Segmentation of Stroke Lesion in 3D Brain MRI
Introduction: Brain tumors such as glioma are among the most aggressive lesions, which result in a very short life expectancy in patients. Image segmentation is highly essential in medical image analysis with applications, particularly in clinical practices to treat brain tumors. Accurate segmentation of magnetic resonance data is crucial for diagnostic purposes, planning surgical treatments, a...
متن کامل3D Reconstruction of Simple Objects from A Single View Silhouette Image
While recent deep neural networks have achieved promising results for 3D reconstruction from a single-view image, these rely on the availability of RGB textures in images and extra information as supervision. In this work, we propose novel stacked hierarchical networks and an end to end training strategy to tackle a more challenging task for the first time, 3D reconstruction from a single-view ...
متن کاملA hierarchical Convolutional Neural Network for Segmentation of Stroke Lesion in 3D Brain MRI
Introduction: Brain tumors such as glioma are among the most aggressive lesions, which result in a very short life expectancy in patients. Image segmentation is highly essential in medical image analysis with applications, particularly in clinical practices to treat brain tumors. Accurate segmentation of magnetic resonance data is crucial for diagnostic purposes, planning surgical treatments, a...
متن کاملUnsupervised 3D Reconstruction from a Single Image via Adversarial Learning
Recent advancements in deep learning opened new opportunities for learning a high-quality 3D model from a single 2D image given sufficient training on large-scale data sets. However, the significant imbalance between available amount of images and 3D models, and the limited availability of labeled 2D image data (i.e. manually annotated pairs between images and their corresponding 3D models), se...
متن کامل